Overview

Background on IPMs, outline of salmonIPM

Model Structure

Process Model

Egg Deposition and Egg-to-Smolt Survival

The chum life cycle begins with spawning, egg incubation and fry emergence, and shortly thereafter the downstream migration of juveniles (which we refer to as smolts). In salmonIPM we can fit three alternative spawner-recruit functions to describe the expected relationship between spawner abundance \(S_{jt}\) and smolt abundance \(M_{jt}\) in brood year \(t\) in population \(j\): density-independent discrete exponential, Beverton-Holt, and Ricker:

\[ f \left( S_{jt} | \alpha_{jt}, M_{\text{max},j}, A_{jt} \right) = \begin{cases} \alpha_{jt} S_{jt} & \text{exponential} \\ \dfrac{ \alpha_{jt} S_{jt} }{ 1 + \dfrac{ \alpha_{jt} S_{jt} }{ A_{jt} M_{\text{max},j} } } & \text{Beverton-Holt} \\ \alpha_{jt} S_{jt} \text{exp}\left(- \dfrac{ \alpha_{jt}S_{jt} }{ \text{exp}(1) A_{jt} M_{\text{max},j} } \right) & \text{Ricker} \end{cases} \]

We use a nonstandard parameterization of the Ricker in terms of maximum smolt production or “capacity” \(M_{\text{max}}\), corresponding to the mode of the function. This facilitates direct comparison with the Beverton-Holt (where \(M_{\text{max}}\) is the asymptote) and is better-identified by data than the standard parameterization based on per capita density dependence. \(M_{\text{max}}\) has units of density (fish per stream distance or area) and is then expanded to units of abundance based on habitat size \(A_{jt}\). In the case of Lower Columbia chum, this is km of spawning habitat estimated using GIS.
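To make the capacity parameterization concrete, here is a minimal R sketch of the three functional forms. The function name `SR`, its arguments, and the numerical values are illustrative assumptions, not part of the salmonIPM API. It checks numerically that the Ricker curve peaks at \(A M_{\text{max}}\) and that the Beverton-Holt curve approaches the same value at high spawner abundance.

```r
# Expected smolt production under the three spawner-recruit forms.
# S: spawners, alpha: intrinsic productivity, Mmax: capacity per unit habitat,
# A: habitat size. Names and values are hypothetical, for illustration only.
SR <- function(S, alpha, Mmax, A,
               SR_fun = c("exponential", "BH", "Ricker")) {
  SR_fun <- match.arg(SR_fun)
  K <- A * Mmax  # capacity expanded to units of abundance
  switch(SR_fun,
         exponential = alpha * S,
         BH          = alpha * S / (1 + alpha * S / K),
         Ricker      = alpha * S * exp(-alpha * S / (exp(1) * K)))
}

alpha <- 500; Mmax <- 1500; A <- 2     # made-up parameter values
S_peak <- exp(1) * A * Mmax / alpha    # spawner abundance at the Ricker mode

ricker_peak <- SR(S_peak, alpha, Mmax, A, "Ricker")  # equals A * Mmax
bh_large_S  <- SR(1e9, alpha, Mmax, A, "BH")         # approaches A * Mmax
```

Because both density-dependent forms share the same capacity \(A M_{\text{max}}\), the parameter has the same interpretation (maximum expected smolt production) under either model.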

Intrinsic productivity \(\alpha_{jt}\) is calculated as a weighted mean of age-specific female fecundity \(\mu_{E,a}\), weighted by the spawner age distribution \(q_{ajt}\), multiplied by the proportion of female spawners \(q_{\text{F},jt}\) (discounted for the proportion of females that are not “green”, i.e. not fully fecund, with discount rate \(\delta_\text{NG} \in [0,1]\)), and finally multiplied by the density-independent maximum egg-to-smolt survival \(\psi_j\):

\[ \alpha_{jt} = \psi_j q_{\text{F},jt} \left[p_{\text{G},jt} + \delta_\text{NG} \left(1 - p_{\text{G},jt} \right) \right] \sum_{a=3}^{5} q_{ajt} \mu_{E,a}. \]

The discount for partially spawned females is only relevant for one population, Duncan Channel, a constructed spawning channel in which some translocated females are believed to have already deposited eggs elsewhere. We assume the proportion of “green” females \(p_{\text{G},jt}\) is known without error since translocated spawners are individually handled and visually assessed.
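As a worked example of the intrinsic productivity calculation, the following R sketch evaluates \(\alpha_{jt}\) for a single population-year. The helper `calc_alpha` and all input values are hypothetical; the sketch only illustrates how the "green" discount scales productivity.

```r
# Intrinsic productivity for one population-year (all inputs are made up).
calc_alpha <- function(psi, q_F, p_G, delta_NG, q, mu_E) {
  stopifnot(length(q) == length(mu_E))
  eggs_per_female <- sum(q * mu_E)               # age-weighted mean fecundity
  green_discount  <- p_G + delta_NG * (1 - p_G)  # 1 if all females are green
  psi * q_F * green_discount * eggs_per_female
}

q    <- c(0.2, 0.7, 0.1)     # spawner age distribution, ages 3-5
mu_E <- c(2600, 2850, 2870)  # mean fecundity by age

# All females green versus 80% green with non-green females at half value
alpha_green   <- calc_alpha(psi = 0.6, q_F = 0.5, p_G = 1.0,
                            delta_NG = 0.5, q = q, mu_E = mu_E)
alpha_partial <- calc_alpha(psi = 0.6, q_F = 0.5, p_G = 0.8,
                            delta_NG = 0.5, q = q, mu_E = mu_E)
```

With these inputs, 20% non-green females valued at half of full fecundity reduce \(\alpha_{jt}\) by 10%.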

This formulation implicitly assumes egg deposition is density-independent while egg-to-smolt survival is density-dependent, resulting in realized survival \(< \psi_j\) as spawner density increases. Maximum egg-to-smolt survival varies randomly among populations according to the hyperdistribution

\[ \text{logit}(\psi_j) \sim N(\mu_\psi, \sigma_\psi). \]

Maximum smolt production varies randomly among populations according to the hyperdistribution

\[ \text{log}(M_{\text{max},j}) \sim N(\mu_{M_\text{max}}, \sigma_{M_\text{max}}). \]

Smolt Recruitment Process Error

Given the expected smolt production, realized smolt production (the unknown “true state”) is lognormally distributed with process errors that include a common ESU-level autoregressive trend \(\eta^\text{year}_{M,t}\) with first-order autocorrelation coefficient \(\rho_{M}\) and innovation SD \(\sigma^\text{year}_{M}\), plus unique independent shocks \(\epsilon_{M,jt}\):

\[ \begin{aligned} M_{jt} &= f \left( S_{jt} | \alpha_{jt}, M_{\text{max},j}, A_{jt} \right) \, \text{exp}( \eta^\text{year}_{M,t} + \epsilon_{M,jt} ) \\ \eta^\text{year}_{M,t} &\sim N(\rho_{M} \eta^\text{year}_{M,t-1}, \sigma^\text{year}_{M}) \\ \epsilon_{M,jt} &\sim N(0, \sigma_{M}). \end{aligned} \]
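The process error structure can be simulated directly. This is a sketch under stated assumptions: the parameter values are illustrative, the expected smolt production is a constant placeholder rather than the output of a spawner-recruit function, and the first year of the shared trend is drawn from the stationary AR(1) distribution, which the text above does not specify.

```r
set.seed(1)
n_pop <- 3; n_year <- 20
rho_M <- 0.1; sigma_year_M <- 0.45; sigma_M <- 0.3  # illustrative values

# Shared ESU-level AR(1) trend (stationary initial state is an assumption)
eta_year <- numeric(n_year)
eta_year[1] <- rnorm(1, 0, sigma_year_M / sqrt(1 - rho_M^2))
for (t in 2:n_year) {
  eta_year[t] <- rnorm(1, rho_M * eta_year[t - 1], sigma_year_M)
}

# Independent population-by-year shocks
eps <- matrix(rnorm(n_pop * n_year, 0, sigma_M), n_pop, n_year)

# Placeholder for the expected smolt production f(S | alpha, Mmax, A)
M_hat <- matrix(1e5, n_pop, n_year)

# Realized smolts: every population shares eta_year[t] in year t
M <- M_hat * exp(sweep(eps, 2, eta_year, "+"))
```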

In salmonIPM we can include covariates of the parameters \(\psi_j\) and \(M_{\text{max},j}\) as well as the smolt recruitment process error term. We do not consider this option further here, but we anticipate incorporating environmental covariates in future development of the Lower Columbia chum IPM.

Smolt-to-Adult Survival

Smolt-to-adult survival (SAR, \(s_{MS}\)) is assumed to be density-independent. The SAR process model for outmigrant cohort \(t\) in population \(j\) is logistic normal, including a common ESU-level first-order autoregressive trend \(\eta^\text{year}_{MS,t}\) with autocorrelation coefficient \(\rho_{MS}\) and innovation SD \(\sigma^\text{year}_{MS}\), plus unique independent shocks \(\epsilon_{MS,jt}\):

\[ \begin{aligned} \text{logit}( s_{MS,jt} ) &= \text{logit}( \mu_{MS} ) + \eta^\text{year}_{MS,t} + \epsilon_{MS,jt} \\ \eta^\text{year}_{MS,t} &\sim N(\rho_{MS} \eta^\text{year}_{MS,t-1}, \sigma^\text{year}_{MS}) \\ \epsilon_{MS,jt} &\sim N(0, \sigma_{MS}). \end{aligned} \]

As with the smolt recruitment parameters, salmonIPM can accommodate spatiotemporally varying covariates of \(s_{MS}\), but we do not discuss this further here.
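To see what logit-scale deviations mean on the survival scale, the short R sketch below applies a combined deviation \(\eta^\text{year}_{MS,t} + \epsilon_{MS,jt}\) of \(-1\), \(0\) and \(+1\) to a hypothetical hyper-mean SAR. The numbers are illustrative only.

```r
mu_MS <- 0.003       # hypothetical hyper-mean SAR
dev   <- c(-1, 0, 1) # eta_year + epsilon on the logit scale

# qlogis() is the logit and plogis() its inverse in base R
s_MS <- plogis(qlogis(mu_MS) + dev)

# For small survival, a logit deviation d scales the odds, and hence
# approximately the survival itself, by exp(d)
ratio <- s_MS / mu_MS
```

A zero deviation returns the hyper-mean exactly, and the logit link keeps survival inside \((0, 1)\) regardless of how large the process errors are.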

Conditional Age-at-Return

Adult age structure is modeled by defining a vector of conditional probabilities, \(\mathbf{p}_{jt} = [p_{3jt}, p_{4jt}, p_{5jt}] ^ \top\), where \(p_{ajt}\) is the probability that an outmigrant in year \(t\) in population \(j\) returns at age \(a\), given that it survives to adulthood. The unconditional probability is given by \(s_{MS,jt} p_{ajt}\), where both SAR and \(p_a\) are functions of underlying annual marine survival and maturation probabilities that are nonidentifiable without some ancillary data. This parameterization resolves the nonidentifiability.

The conditional age probabilities follow a logistic normal process model with hierarchical structure across populations and through time within each population. The additive log ratio,

\[ \text{alr}(\mathbf{p}_{jt}) = \left[ \text{log} \left( \dfrac{p_{3jt}}{p_{5jt}} \right), \text{log} \left( \dfrac{p_{4jt}}{p_{5jt}} \right) \right] ^ \top \]

has a hierarchical bivariate normal distribution with population-level random effects \(\boldsymbol{\eta}^\text{pop}_{\mathbf{p}, j}\) and unique residuals \(\boldsymbol{\epsilon}_{\mathbf{p}, jt}\):

\[ \begin{aligned} \text{alr}(\mathbf{p}_{jt}) &= \text{alr}(\boldsymbol{\mu}_\mathbf{p}) + \boldsymbol{\eta}^\text{pop}_{\mathbf{p}, j} + \boldsymbol{\epsilon}_{\mathbf{p}, jt} \\ \boldsymbol{\eta}^\text{pop}_{\mathbf{p}, j} &\sim N(\mathbf{0}, \boldsymbol{\Sigma}^\text{pop}_\mathbf{p}) \\ \boldsymbol{\epsilon}_{\mathbf{p}, jt} &\sim N(\mathbf{0}, \boldsymbol{\Sigma}_\mathbf{p}). \end{aligned} \]

Here the 2 \(\times\) 2 covariance matrices \(\boldsymbol{\Sigma}^\text{pop}_\mathbf{p}\) and \(\boldsymbol{\Sigma}_\mathbf{p}\) allow correlated variation among age classes (on the unconstrained scale, not merely due to the simplex constraint on \(\mathbf{p}\)) across populations and through time within a population, respectively. For example, some populations or cohorts may skew overall younger or older than average. We parameterize each covariance matrix by a vector of standard deviations and a correlation matrix:

\[ \begin{aligned} \boldsymbol{\Sigma}^\text{pop}_\mathbf{p} &= \text{diag}(\boldsymbol{\sigma}^\text{pop}_\mathbf{p}) \, \mathbf{R}_\mathbf{p}^\text{pop} \, \text{diag}(\boldsymbol{\sigma}^\text{pop}_\mathbf{p}) \\ \boldsymbol{\Sigma}_\mathbf{p} &= \text{diag}(\boldsymbol{\sigma}_\mathbf{p}) \, \mathbf{R}_\mathbf{p} \, \text{diag}(\boldsymbol{\sigma}_\mathbf{p}). \end{aligned} \]
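The additive log ratio transform and the covariance construction can be sketched in a few lines of R. The helpers `alr` and `inv_alr` and all numerical values are illustrative assumptions. The covariance matrix is built by scaling the rows and columns of the correlation matrix by the standard deviations, and one set of conditional age probabilities is drawn for a single cohort, omitting the population-level effect for brevity.

```r
# alr with the last category (age 5) as the reference, and its inverse
alr     <- function(p) log(p[-length(p)] / p[length(p)])
inv_alr <- function(x) { e <- exp(c(x, 0)); e / sum(e) }

mu_p    <- c(0.23, 0.72, 0.05)  # illustrative mean age distribution
sigma_p <- c(1.7, 0.9)          # SDs on the alr scale (made up)
R_p     <- matrix(c(1, 0.75, 0.75, 1), 2, 2)  # correlation matrix (made up)

# Covariance matrix from SDs and correlations
Sigma_p <- diag(sigma_p) %*% R_p %*% diag(sigma_p)

# One bivariate normal draw via the lower Cholesky factor
set.seed(1)
L    <- t(chol(Sigma_p))
eps  <- as.vector(L %*% rnorm(2))
p_jt <- inv_alr(alr(mu_p) + eps)  # a valid simplex by construction
```

A positive correlation between the two alr components, as in this example, means ages 3 and 4 tend to increase or decrease together relative to age 5.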

Sex Structure

Adult sex structure is modeled as the conditional probability \(p_{\text{F},jt}\) that an outmigrant from population \(j\) in year \(t\) is female, given that it survives to adulthood. The proportion of females follows a process model that includes normally distributed population-specific random effects \(\eta^\text{pop}_{\text{F},j}\) with hyper-SD \(\sigma^\text{pop}_\text{F}\) and unique residuals \(\epsilon_{\text{F},jt}\) with SD \(\sigma_\text{F}\) around the hyper-mean \(\mu_\text{F}\):

\[ \begin{aligned} \text{logit}( p_{\text{F},jt} ) &= \text{logit}( \mu_\text{F} ) + \eta^\text{pop}_{\text{F},j} + \epsilon_{\text{F},jt} \\ \eta^\text{pop}_{\text{F},j} &\sim N(0, \sigma^\text{pop}_\text{F}) \\ \epsilon_{\text{F},jt} &\sim N(0, \sigma_\text{F}) \end{aligned} \]

Adult Recruitment

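Wild spawner abundance in return year \(t\) is the sum over ages of adults surviving from the contributing outmigrant cohorts, \(\tilde{S}_{\text{W},ajt}\), minus broodstock removals \(B_{jt}\) plus translocated spawners \(S^\text{add}_{jt}\). Broodstock removals and translocations are assumed known, and harvest is assumed to be zero for now: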

\[ S_{\text{W}, jt} = \left(\sum_{a=3}^{5} s_{MS,j,t-a} \hspace{0.1cm} p_{aj,t-a} \hspace{0.1cm} M_{j,t-a} \right) - B_{jt} + S^\text{add}_{jt} = \left(\sum_{a=3}^{5} \tilde{S}_{\text{W}, ajt} \right) - B_{jt} + S^\text{add}_{jt} \]

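Hatchery-origin spawner abundance is then derived from wild spawner abundance and the proportion of hatchery-origin spawners \(p_{\text{HOS},jt}\):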

\[ S_{\text{H},jt} = \dfrac{ S_{\text{W},jt} \hspace{0.1cm} p_{\text{HOS},jt} } { (1 - p_{\text{HOS},jt}) } \]

Total spawner abundance is then \(S_{jt} = S_{\text{W},jt} + S_{\text{H},jt}\). Spawner age structure is given by \(\mathbf{q}_{jt} = [q_{3jt}, q_{4jt}, q_{5jt}]\), where \(q_{ajt} = \tilde{S}_{\text{W},ajt} / S_{jt}\). Spawner sex structure is given by the age-weighted average of female proportions from the respective outmigrant cohorts: \(q_{\text{F},jt} = \sum_{a} {q_{ajt} \hspace{0.1cm} p_{\text{F},j,t-a}}\).
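The spawner abundance bookkeeping can be traced with a small numerical example in R. All inputs are made up, and the vectors are ordered by return age \(a = 3, 4, 5\), so each element refers to a different outmigrant cohort.

```r
# One population, one return year t (hypothetical inputs)
s_MS  <- c(0.004, 0.002, 0.003)  # SAR of cohorts t-3, t-4, t-5
p_a   <- c(0.25, 0.70, 0.04)     # conditional prob. of return at age 3, 4, 5
M     <- c(2e5, 5e5, 3e5)        # smolts from cohorts t-3, t-4, t-5
B     <- 20                      # broodstock removals (assumed known)
S_add <- 0                       # translocated spawners (assumed known)
p_HOS <- 0.1                     # proportion of hatchery-origin spawners

S_W_a <- s_MS * p_a * M               # wild adults at age: 200, 700, 36
S_W   <- sum(S_W_a) - B + S_add       # wild spawners
S_H   <- S_W * p_HOS / (1 - p_HOS)    # hatchery-origin spawners
S     <- S_W + S_H                    # total spawners
```

The expression for \(S_{\text{H},jt}\) is simply the value that makes hatchery fish a fraction \(p_{\text{HOS},jt}\) of the total, which can be confirmed by computing `S_H / S`.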

Observation Model

Fecundity

We modeled observations of fecundity from individual female chum salmon collected at hatcheries. The likelihood for the fecundity of female \(i\) of age \(a\) is a zero-truncated normal with age-specific mean and SD.

\[ E_{a,i}^\text{obs} \sim N_{+}(\mu_{E,a}, \sigma_{E,a}) \]
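For reference, the zero-truncated normal density is the ordinary normal density renormalized by the probability of a positive value. A minimal R sketch follows; the function name `dnorm_trunc0` and the data are hypothetical.

```r
# Density of a normal distribution truncated to positive values
dnorm_trunc0 <- function(x, mean, sd, log = FALSE) {
  # log density minus log P(X > 0) under the untruncated normal
  ld <- dnorm(x, mean, sd, log = TRUE) -
    pnorm(0, mean, sd, lower.tail = FALSE, log.p = TRUE)
  ld[x <= 0] <- -Inf  # no mass at or below zero
  if (log) ld else exp(ld)
}

# Log-likelihood of a few made-up fecundity observations for one age class
E_obs  <- c(2100, 2950, 3300)
ll_fec <- sum(dnorm_trunc0(E_obs, mean = 2857, sd = 561, log = TRUE))

# The truncated density integrates to one
total_mass <- integrate(dnorm_trunc0, 0, Inf, mean = 1, sd = 1)$value
```

When the mean is several SDs above zero, as with fecundities in the thousands, the truncation changes the likelihood very little; its main role is to rule out negative values.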

Smolt and Spawner Abundance

Smolt and spawner abundance data enter the model as informative lognormal priors on the true states, based on Bayesian observation models applied to field data of various kinds:

\[ \begin{aligned} \text{log}(M_{jt}) &\sim N(\mu_{M,jt}, \tau_{M,jt}) \\ \text{log}(S_{jt}) &\sim N(\mu_{S,jt}, \tau_{S,jt}) \end{aligned} \]

Some prior observation error SDs are missing or unknown, and so were imputed by fitting a lognormal hyperdistribution to the known SDs:

\[ \begin{aligned} \text{log}(\tau_{M,jt}) &\sim N( \mu_{\tau_M}, \sigma_{\tau_M}) \\ \text{log}(\tau_{S,jt}) &\sim N( \mu_{\tau_S}, \sigma_{\tau_S}) \end{aligned} \]
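The idea behind the imputation can be illustrated with a simplified two-step version in R. This is only a plug-in approximation with made-up data: in the IPM the hyperparameters and the missing SDs are estimated jointly with all other unknowns, not fitted separately as here.

```r
set.seed(1)
# Made-up observation error SDs; NA marks values that were not reported
tau_obs <- c(0.05, 0.08, NA, 0.12, 0.20, NA, 0.07)
known   <- !is.na(tau_obs)

# Fit the lognormal hyperdistribution to the known SDs (moment estimates)
mu_tau    <- mean(log(tau_obs[known]))
sigma_tau <- sd(log(tau_obs[known]))

# Draw the missing SDs from the fitted hyperdistribution
tau_imputed <- tau_obs
tau_imputed[!known] <- rlnorm(sum(!known), mu_tau, sigma_tau)
```

In a fully Bayesian fit each missing SD has a posterior distribution instead of a single draw, so the uncertainty in the imputed values propagates to the abundance estimates.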

Spawner Age Composition

Age composition of wild spawners \(\mathbf{n}_{jt}^\text{obs} = [n_{3jt}^\text{obs}, n_{4jt}^\text{obs}, n_{5jt}^\text{obs}] ^\top\) is assumed to follow a multinomial likelihood with the expected proportions given by the unobserved true state:

\[ \mathbf{n}_{jt}^\text{obs} \sim \text{Multinomial} \left( \sum_a n_{ajt}^\text{obs}, \mathbf{q}_{jt} \right) \]
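In R, the contribution of one population-year to this likelihood can be evaluated with `dmultinom()`. The counts and proportions here are made up for illustration.

```r
n_obs <- c(12, 40, 3)          # made-up counts of age-3, -4 and -5 spawners
q_jt  <- c(0.23, 0.72, 0.05)   # hypothetical true age proportions

# Multinomial log-likelihood with the sample size fixed at its observed value
ll_age <- dmultinom(n_obs, size = sum(n_obs), prob = q_jt, log = TRUE)

# The same quantity written out: log multinomial coefficient + sum(n * log(q))
ll_manual <- lgamma(sum(n_obs) + 1) - sum(lgamma(n_obs + 1)) +
  sum(n_obs * log(q_jt))
```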

Spawner Rearing Type

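The hatchery/wild composition of spawners is informed by counts of hatchery-origin (\(n_{\text{H},jt}^\text{obs}\)) and wild (\(n_{\text{W},jt}^\text{obs}\)) fish in a sample of known rearing type, with the hatchery count following a binomial likelihood: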

\[ n_{\text{H},jt}^\text{obs} \sim \text{Bin} \left( n_{\text{W},jt}^\text{obs} + n_{\text{H},jt}^\text{obs}, p_{\text{HOS},jt} \right) \]

Spawner Sex Composition

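The observed number of females \(n_{\text{F},jt}^\text{obs}\) in a sample of spawners of known sex follows a binomial likelihood whose probability is the proportion of female spawners \(q_{\text{F},jt}\):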

\[ n_{\text{F},jt}^\text{obs} \sim \text{Bin} \left( n_{\text{M},jt}^\text{obs} + n_{\text{F},jt}^\text{obs}, q_{\text{F},jt} \right) \]
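Both the rearing-type and sex composition likelihoods are binomial, so each can be evaluated with `dbinom()`. The counts in this sketch are made up. It also profiles the rearing-type log-likelihood over a grid to show that, taken on its own, it peaks at the sample proportion.

```r
n_H <- 6;  n_W <- 54   # made-up hatchery and wild counts
n_F <- 31; n_M <- 29   # made-up female and male counts

ll_origin <- dbinom(n_H, n_W + n_H, prob = 0.10, log = TRUE)
ll_sex    <- dbinom(n_F, n_M + n_F, prob = 0.50, log = TRUE)

# Profile the rearing-type log-likelihood over candidate p_HOS values
p_grid   <- seq(0.01, 0.99, by = 0.01)
ll_grid  <- dbinom(n_H, n_W + n_H, prob = p_grid, log = TRUE)
p_HOS_ml <- p_grid[which.max(ll_grid)]   # sample proportion n_H / (n_W + n_H)
```

In the IPM these likelihoods are combined with the process model, so the posterior for \(q_{\text{F},jt}\) in particular also depends on the age and sex structure of the contributing cohorts.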

Priors

Table of hyperpriors

Setup and Data

Load the packages we’ll need…

options(device = ifelse(.Platform$OS.type == "windows", "windows", "quartz"))
options(mc.cores = parallel::detectCores(logical = FALSE) - 1)

library(salmonIPM)
library(rstan)
library(shinystan)
library(matrixStats)
library(Hmisc)
library(dplyr)
library(tidyr)
library(yarrr)
library(magicaxis)
library(viridis)
library(zoo)
library(ggplot2)
theme_set(theme_bw(base_size = 16))
library(here)

# load data
source(here("analysis","R","01_LCRchumIPM_data.R"))
# load plotting functions
source(here("analysis","R","03_LCRchumIPM_plots.R"))
# load saved stanfit objects
if(file.exists(here("analysis","results","LCRchumIPM.RData")))
  load(here("analysis","results","LCRchumIPM.RData"))

Read in and manipulate the data…

Let’s look at the first few rows of fish_data_SMS to see the fish_data format salmonIPM expects…

head(fish_data_SMS)

Retrospective Models

Fit two-stage spawner-smolt-spawner models and explore output…

We fit exponential, Beverton-Holt and Ricker models, but model comparison using LOO is not feasible, so here we focus on the Ricker.

LCRchum_Ricker <- salmonIPM(fish_data = fish_data_SMS, fecundity_data = fecundity_data,
                            ages = list(M = 1), stan_model = "IPM_LCRchum_pp", SR_fun = "Ricker",
                            log_lik = TRUE, chains = 3, iter = 1500, warmup = 500,
                            control = list(adapt_delta = 0.99, max_treedepth = 14))
print(LCRchum_Ricker, prob = c(0.05,0.5,0.95),
      pars = c("psi","Mmax","eta_year_M","eta_year_MS","eta_pop_p","mu_pop_alr_p","p","p_F",
               "tau_M","tau_S","p_HOS","B_rate","E_hat","M","S","s_MS","q","q_F","LL"), 
      include = FALSE, use_cache = FALSE)
Inference for Stan model: IPM_LCRchum_pp.
3 chains, each with iter=1500; warmup=500; thin=1; 
post-warmup draws per chain=1000, total post-warmup draws=3000.

                    mean se_mean    sd        5%       50%       95% n_eff Rhat
mu_E[1]          2591.57    0.64 44.93   2518.44   2591.13   2667.22  4936 1.00
mu_E[2]          2856.96    0.30 24.60   2817.14   2856.61   2897.67  6685 1.00
mu_E[3]          2868.99    1.11 73.02   2747.91   2869.97   2985.28  4357 1.00
sigma_E[1]        510.29    0.48 33.22    457.73    509.36    566.74  4833 1.00
sigma_E[2]        560.74    0.28 17.22    533.23    560.26    590.04  3829 1.00
sigma_E[3]        435.38    0.93 55.47    355.67    429.31    534.84  3584 1.00
delta_NG            0.57    0.01  0.24      0.17      0.58      0.95  1626 1.00
mu_psi              0.59    0.00  0.08      0.46      0.58      0.73   668 1.01
sigma_psi           0.40    0.01  0.26      0.05      0.37      0.87   756 1.00
mu_Mmax             7.28    0.02  0.58      6.44      7.24      8.27   675 1.00
sigma_Mmax          1.31    0.01  0.47      0.75      1.21      2.17  1016 1.00
rho_M               0.10    0.02  0.43     -0.64      0.13      0.75   519 1.00
sigma_year_M        0.46    0.00  0.12      0.30      0.45      0.68  1130 1.00
sigma_M             0.30    0.00  0.05      0.22      0.29      0.37   707 1.00
mu_MS               0.00    0.00  0.00      0.00      0.00      0.00  1515 1.00
rho_MS              0.49    0.01  0.22      0.10      0.52      0.80   717 1.00
sigma_year_MS       1.03    0.01  0.22      0.73      1.00      1.43  1017 1.00
sigma_MS            0.56    0.00  0.05      0.47      0.56      0.66   627 1.01
mu_p[1]             0.23    0.00  0.02      0.20      0.23      0.27   461 1.01
mu_p[2]             0.72    0.00  0.02      0.69      0.72      0.75   549 1.01
mu_p[3]             0.04    0.00  0.01      0.03      0.04      0.05   444 1.00
sigma_pop_p[1]      0.20    0.01  0.17      0.02      0.16      0.53   290 1.02
sigma_pop_p[2]      0.14    0.01  0.12      0.01      0.11      0.37   332 1.01
R_pop_p[1,1]        1.00     NaN  0.00      1.00      1.00      1.00   NaN  NaN
R_pop_p[1,2]        0.35    0.03  0.58     -0.78      0.53      0.98   529 1.00
R_pop_p[2,1]        0.35    0.03  0.58     -0.78      0.53      0.98   529 1.00
R_pop_p[2,2]        1.00    0.00  0.00      1.00      1.00      1.00  2848 1.00
sigma_p[1]          1.70    0.01  0.14      1.48      1.69      1.94   509 1.01
sigma_p[2]          0.87    0.00  0.09      0.72      0.86      1.03   565 1.01
R_p[1,1]            1.00     NaN  0.00      1.00      1.00      1.00   NaN  NaN
R_p[1,2]            0.75    0.00  0.06      0.64      0.76      0.85   695 1.01
R_p[2,1]            0.75    0.00  0.06      0.64      0.76      0.85   695 1.01
R_p[2,2]            1.00    0.00  0.00      1.00      1.00      1.00  3018 1.00
mu_F                0.50    0.00  0.02      0.47      0.50      0.52  1000 1.00
sigma_pop_F         0.19    0.00  0.07      0.09      0.18      0.31   770 1.00
sigma_F             0.38    0.00  0.04      0.32      0.37      0.44  1112 1.00
mu_tau_M            0.08    0.00  0.01      0.06      0.08      0.10  3418 1.00
sigma_tau_M         1.13    0.00  0.12      0.96      1.13      1.34  3193 1.00
mu_tau_S            0.11    0.00  0.01      0.10      0.11      0.12  2642 1.00
sigma_tau_S         0.98    0.00  0.06      0.89      0.98      1.08  2966 1.00
lp__           -41678.71    1.36 38.84 -41745.68 -41677.70 -41614.36   812 1.01

Samples were drawn using NUTS(diag_e) at Sun May 09 06:10:01 2021.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).
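A table this long is easier to screen programmatically. The sketch below flags parameters with a low effective sample size or elevated Rhat, using a few rows copied from the output above. The helper `flag_mixing` and its thresholds are assumptions chosen for illustration; for a stanfit object the full matrix with `n_eff` and `Rhat` columns is available from `rstan::summary(fit)$summary`.

```r
# A few rows transcribed from the printed summary (n_eff and Rhat columns)
summ <- rbind(
  "mu_psi"         = c(n_eff = 668,  Rhat = 1.01),
  "sigma_pop_p[1]" = c(n_eff = 290,  Rhat = 1.02),
  "sigma_pop_p[2]" = c(n_eff = 332,  Rhat = 1.01),
  "mu_F"           = c(n_eff = 1000, Rhat = 1.00)
)

# Hypothetical screening rule; the thresholds are arbitrary illustrations
flag_mixing <- function(summ, min_n_eff = 400, max_Rhat = 1.01) {
  bad <- summ[, "n_eff"] < min_n_eff | summ[, "Rhat"] > max_Rhat
  rownames(summ)[which(bad)]  # which() drops NaN rows such as fixed elements
}

flagged <- flag_mixing(summ)
```

Under these thresholds only the population-level SDs of the age-structure random effects are flagged. Rows with `NaN` diagnostics, such as the diagonal elements of the correlation matrices, are constants and are skipped.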

Plot estimated spawner-smolt production curves and parameters for the Ricker model.

Figure 1: Estimated Ricker spawner-recruit relationship (A, B) and intrinsic productivity (C) and capacity (D) parameters for the multi-population IPM. Thin lines correspond to each of 12 populations of Lower Columbia chum salmon; thick lines represent hyper-means across populations. In (A, B), each curve is a posterior median and the shaded region represents the 90% credible interval of the hyper-mean curve (uncertainty around the population-specific curves is omitted for clarity).

Here are the fits to the spawner data:

Figure 2: Observed (points) and estimated spawner abundance for Lower Columbia River chum salmon populations. Filled points indicate known observation error SD, while SD for open points is imputed. The posterior median (solid gray line) is from the multi-population IPM. Posterior 90% credible intervals indicate process (dark shading) and observation (light shading) uncertainty.

And here are the fits to the much sparser smolt data:

Figure 3: Observed (points) and estimated smolt abundance for Lower Columbia River chum salmon populations. Filled points indicate known observation error SD, while SD for open points is imputed. The posterior median (solid gray line) is from the multi-population IPM. Posterior 90% credible intervals indicate process (dark shading) and observation (light shading) uncertainty.

To understand how the IPM is imputing the observation error SD in cases where it is not reported, let’s look at the lognormal hyperdistribution fitted to the known SD values…

Figure 4: Lognormal hyperdistributions used to impute unknown smolt and spawner observation error SDs in the IPM. The posterior median (line) and 90% credible interval (shading) of the distribution fitted to the known SD values (histogram) are shown for each life stage.

We can also compare the estimated spawner age-frequencies to the sample proportions from the BioData. Age composition varies quite a bit across populations and through time, reflecting fluctuations in cohort strength.

Figure 5: Observed (points) and estimated spawner age composition for Lower Columbia River chum salmon populations. The posterior distribution from the multi-population IPM is summarized by the median (solid line) and 90% credible interval (shading). The error bar around each observed proportion indicates the 90% binomial confidence interval based on sample size.
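Binomial intervals like the error bars in Figure 5 can be computed from the age sample alone. The sketch below uses exact (Clopper-Pearson) intervals from base R's `binom.test()` on made-up counts; the figure itself may use a different interval method.

```r
age_counts <- c(age3 = 12, age4 = 40, age5 = 3)  # made-up age sample
n     <- sum(age_counts)
p_hat <- age_counts / n                          # sample proportions

# 90% exact binomial confidence interval for each age class
ci90 <- t(sapply(age_counts, function(x) {
  binom.test(x, n, conf.level = 0.90)$conf.int
}))
colnames(ci90) <- c("lower", "upper")
```

These intervals depend only on the sample size and observed proportion, which is why sparsely sampled population-years have wide error bars even when the posterior interval is narrow.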

Sex ratio

Proportion of hatchery-origin spawners

Forecasting

It is straightforward to use the IPM to generate forecasts of population dynamics…

Figure 6: Observed (points) and estimated spawner abundance for Lower Columbia River chum salmon populations, including 5-year forecasts. Filled points indicate known observation error SD, while SD for open points is imputed. The posterior median (solid gray line) is from the multi-population IPM. Posterior 90% credible intervals indicate process (dark shading) and observation (light shading) uncertainty.

Of course we could also look at forecasts of smolts, or any other state variable. Here are the 2020 forecasts of wild spawners for each population…